Introduction: Bivariate (and multivariate) asymptotic distributions of extremes
are useful for dealing with many concrete problems, such as the largest ages of
death for men and women, whose distribution naturally splits into the product of
the margins, by independence; the floods (or droughts) at two different places of
the same river, each year; bivariate (or multivariate) extreme meteorological data
(pressure, temperature, wind velocity, etc.) each week; largest waves each week,
etc.


Evidently, the target of a study of asymptotic distributions of bivariate
extremes is to obtain asymptotic probabilistic behavior and, also, to provide
bivariate models of (asymptotic) extremes that fit observed data. It can be said
that, although some problems are solved, the methods found until now cover much
less area than the theory for univariate extremes, which itself may be said to be
in adolescence but not yet in adult age. Thus bivariate extremes may now be at the
end of infancy and multivariate extremes are even younger! It will be seen
that, in many cases, we do not have the best test, the best estimation procedure,
etc. - although we have one - and in other cases nothing at all is known. An
example: the separation of the bivariate extreme models - so important for
applications - is only now being considered! In general, the few papers up to now
choose one model from the beginning or compare two of them by the use of the
Kolmogoroff-Smirnov or another test; see, for instance, Gumbel and Goldstein
(1964).

Let us briefly recall the basic ideas relating to univariate extremes. Let
$\{X_n\}$ be a sequence of i.i.d. random variables with distribution function $F(x)$.
Then $\mathrm{Prob}\{\max(X_1,\dots,X_n) \le x\} = F^n(x)$. We can ask whether there exist
sequences of attraction coefficients $\{(\lambda_n,\delta_n)\}$ ($\delta_n > 0$) such that

$$\mathrm{Prob}\left\{\frac{\max(X_1,\dots,X_n) - \lambda_n}{\delta_n} \le x\right\} = F^n(\lambda_n + \delta_n x)$$

has a non-degenerate (weak) limiting distribution function $L(x)$. It is well
known, by Khintchine's theorem on convergence of types, that if this happens
the sequence $\{(\lambda_n,\delta_n)\}$ is not unique and an equivalent sequence
$\{(\lambda'_n,\delta'_n)\}$, i.e. one leading to the same limiting distribution $L(x)$, is
such that $(\lambda_n - \lambda'_n)/\delta_n \to 0$ and $\delta'_n/\delta_n \to 1$. With a convenient choice of
$\{(\lambda_n,\delta_n)\}$ we have the reduced (or standard) forms

$$\Psi_\alpha(z) = \exp(-(-z)^{\alpha}) \ \text{if}\ -\infty < z < 0, \qquad = 1 \ \text{if}\ 0 \le z < +\infty; \quad \alpha > 0;$$

$$\Lambda(z) = \exp(-e^{-z}), \quad -\infty < z < +\infty; \quad \text{and}$$

$$\Phi_\alpha(z) = 0 \ \text{if}\ -\infty < z < 0, \qquad = \exp(-z^{-\alpha}) \ \text{if}\ 0 \le z < +\infty; \quad \alpha > 0.$$

The basic paper is Gnedenko (1943), with previous ones of Fisher-Tippett (1928),
Fréchet (1927), Gumbel (1935) and von Mises (1935). For more details see Gumbel
(1958).

$\Psi_\alpha$, $\Lambda$ and $\Phi_\alpha$ are called, respectively, Weibull, Gumbel and Fréchet (standard)
distributions; in practical applications we have to introduce location and
dispersion parameters.

It is evident that by logarithmic transformations we can reduce Weibull and
Fréchet forms to the Gumbel one, so that, for theoretical study - although not
for practical applications - we can concentrate on the Gumbel limiting form, as
will be done in the next sections.

Note that as $\max(X_1,\dots,X_n) = -\min(-X_1,\dots,-X_n)$ the analogous limiting
forms for minima are $1 - L(-z)$, i.e., $1 - \Psi_\alpha(-z)$, $1 - \Lambda(-z)$ and $1 - \Phi_\alpha(-z)$.

Let us recall, as it will be useful in the sequel, that for the Gumbel
distribution $\Lambda(z)$ the mean value is $\gamma = 0.57722$ (Euler's constant), the variance
is $\mu_2 = \sigma^2 = \pi^2/6$, the skewness coefficient is $\mu_3/\sigma^3 = 1.1396$ and the kurtosis
coefficient is $\mu_4/\sigma^4 = 5.4$.
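The four constants quoted above can be checked numerically. The following is only a sketch: it integrates the standard Gumbel density $e^{-z}\exp(-e^{-z})$ by the trapezoidal rule, and the function names, the integration range and the step are illustrative choices, not part of the paper.

```python
import math

def gumbel_density(z):
    """Density of the standard Gumbel law, Lambda'(z) = exp(-z - exp(-z))."""
    return math.exp(-z - math.exp(-z))

def gumbel_moments(lo=-10.0, hi=40.0, step=0.005):
    """Mean, variance, skewness and kurtosis by the trapezoidal rule.

    The range [lo, hi] is an assumption: the density is negligible outside it.
    """
    n = int(round((hi - lo) / step))
    zs = [lo + i * step for i in range(n + 1)]
    ws = [gumbel_density(z) * step for z in zs]
    ws[0] *= 0.5   # trapezoidal end-point weights
    ws[-1] *= 0.5
    mean = sum(w * z for w, z in zip(ws, zs))
    mu2, mu3, mu4 = (sum(w * (z - mean) ** p for w, z in zip(ws, zs))
                     for p in (2, 3, 4))
    return mean, mu2, mu3 / mu2 ** 1.5, mu4 / mu2 ** 2

if __name__ == "__main__":
    print(gumbel_moments())
```

The returned tuple is (mean, variance, skewness coefficient, kurtosis coefficient), to be compared with $\gamma$, $\pi^2/6$, 1.1396 and 5.4.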


Asymptotic behavior of bivariate maxima: Consider now a sequence of i.i.d. random
pairs $\{(X_n,Y_n)\}$ with distribution function $F(x,y)$. Analogously
$\mathrm{Prob}\{\max(X_1,\dots,X_n) \le x, \max(Y_1,\dots,Y_n) \le y\} = F^n(x,y)$. We can seek a pair
of sequences $\{(\lambda_n,\delta_n),(\lambda'_n,\delta'_n)\}$ such that

$$\mathrm{Prob}\left\{\frac{\max(X_1,\dots,X_n) - \lambda_n}{\delta_n} \le x,\ \frac{\max(Y_1,\dots,Y_n) - \lambda'_n}{\delta'_n} \le y\right\} = F^n(\lambda_n + \delta_n x,\ \lambda'_n + \delta'_n y)$$

does have a (weak) limiting non-degenerate distribution function $L(x,y)$. If this
happens the Boole-Fréchet inequality shows that the margins also have (weak)
limiting distributions of the marginal maxima and, thus, are of the three forms
previously given. In relation to what has been said before, we will, from now
on, suppose that the limiting distributions of the margins are of Gumbel form:

$$L(x,+\infty) = \Lambda(x), \qquad L(+\infty,y) = \Lambda(y).$$


Using Khintchine's theorem, as is done for the univariate case, and imposing
Gumbel margins we can show that $L(x,y)$ must satisfy the (stability) relation

$$L^k(x,y) = L(x - \log k,\ y - \log k)$$

for any positive integer $k$. Passing from the positive integer $k$ to rational
$r$ ($>0$) and finally to real $t$ ($>0$) we get

$$L^t(x,y) = L(x - \log t,\ y - \log t).$$

Taking now $x = \log t$ we have $L(x,y) = L^{e^{-x}}(0, y-x)$. Putting now
$L(0,w) = \exp(-(1+e^{-w})k(w))$ we have shown, finally, that the limiting (and stable)
distributions of maxima with Gumbel margins are of the form

$$L(x,y) = \Lambda(x,y) = \exp(-(e^{-x}+e^{-y})\,k(y-x)) = \{\Lambda(x)\Lambda(y)\}^{k(y-x)}.$$


It  remains  to  study  now  the  dependence  function  k(w),  obviously  continuous 
and  non-negative,  for  A  to  be  a  distribution  function.  Those  results  are  well 
known.  They  can  be  found,  with  different  forms  of  margins  in  Finkelshteyn 
(1953),  Tiago  de  Oliveira  (1958),  Geffroy  (1958/59)  and  Sibuya  (1959)  with  a 
synthesis of the results in Tiago de Oliveira (1962/63). Subsequent results
are in Tiago de Oliveira (1975) and (1980). Galambos (1968) contains a more
recent account.

A characterization of the distribution function $\Lambda(x,y)$ can be made in the
following way. It is immediate that a random pair $(X,Y)$ with distribution
function $\Lambda(x,y)$ is such that $V = \max(X+a, Y+b)$ has a Gumbel distribution function
with a location parameter, i.e., $\max(X+a,Y+b) - \{\log(e^a+e^b) + \log k(a-b)\}$ has a
standard Gumbel distribution function. In fact, writing $F(x,y)$ for the
distribution function of the pair and $\lambda(a,b)$ for the location parameter,

$$\mathrm{Prob}\{\max(X+a,Y+b) - \lambda(a,b) \le z\} = F(z + \lambda(a,b) - a,\ z + \lambda(a,b) - b) = \Lambda(z)$$

or

$$F(z-a, z-b) = \Lambda(z - \lambda(a,b)).$$

If we put $z - a = p$, $z - b = q$ we get

$$F(p,q) = \Lambda(z - \lambda(z-p, z-q))$$

and, thus, $z - \lambda(z-p,z-q)$ is independent of $z$. Taking now $z = q$ and
$\lambda(q-p,0) = \log(1+e^{q-p}) + \log k(q-p)$ we obtain the desired form.

Let us now describe the dependence function $k(w)$. Although it is a continuous
function we cannot show that it has a 2nd derivative and consequently we cannot
expect a bivariate extreme random pair with distribution function
$\Lambda(x,y) = [\Lambda(x)\Lambda(y)]^{k(y-x)}$ to have a planar density. In fact, from the
Boole-Fréchet inequality

$$\max(0, \Lambda(x)+\Lambda(y)-1) \le \Lambda(x,y) \le \min(\Lambda(x),\Lambda(y))$$

we have, replacing $x$ and $y$ by $x + \log n$ and $y + \log n$, raising to the power $n$ and
letting $n \to \infty$, the limit inequality

$$\Lambda(x)\Lambda(y) \le \Lambda(x,y) \le \min(\Lambda(x),\Lambda(y))$$

or

$$\exp\{-(e^{-x}+e^{-y})\} \le \Lambda(x,y) \le \exp(-e^{-\min(x,y)}).$$

Evidently the upper limit corresponds to the case where the reduced margins pair
$(X,Y)$ is concentrated, with probability 1, on the first diagonal, the so-called
diagonal case, which is singular; the lower limit corresponds to independence.
For the dependence function we have

$$\text{(diagonal)} \quad \frac{\max(1,e^{w})}{1+e^{w}} \le k(w) \le 1 \quad \text{(independence)}.$$

Note that $k(-\infty) = k(+\infty) = 1$. The behavior of the dependence function can be
described through the behavior of the median line $\Lambda(x,y) = 1/2$ or
$(e^{-x}+e^{-y})k(y-x) = \log 2$; note that the median curve is always in the plane area
bounded by the curve for the diagonal case, $\max(e^{-x},e^{-y}) = \log 2$, and the curve
for independence, $e^{-x}+e^{-y} = \log 2$.
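The representation $\Lambda(x,y) = \exp(-(e^{-x}+e^{-y})k(y-x))$, the stability relation and the two bounding cases can be illustrated with a short sketch. The function names are hypothetical; only the formulas are taken from the text.

```python
import math

def gumbel_cdf(z):
    """Standard Gumbel distribution function Lambda(z)."""
    return math.exp(-math.exp(-z))

def bivariate_cdf(x, y, k):
    """Lambda(x, y) = exp(-(e^-x + e^-y) k(y - x)) for a dependence function k."""
    return math.exp(-(math.exp(-x) + math.exp(-y)) * k(y - x))

def k_independence(w):
    """Upper bound of k: independence."""
    return 1.0

def k_diagonal(w):
    """Lower bound of k: max(1, e^w) / (1 + e^w), written as 1 / (1 + e^-|w|)."""
    return 1.0 / (1.0 + math.exp(-abs(w)))

if __name__ == "__main__":
    x, y, t = 0.3, -0.7, 2.5
    for k in (k_independence, k_diagonal):
        # Stability: Lambda^t(x, y) = Lambda(x - log t, y - log t).
        print(bivariate_cdf(x, y, k) ** t,
              bivariate_cdf(x - math.log(t), y - math.log(t), k))
```

With `k_independence` the function returns $\Lambda(x)\Lambda(y)$ and with `k_diagonal` it returns $\min(\Lambda(x),\Lambda(y))$, the two limits of the inequality above.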

If there is a planar density, i.e., $k''(w)$ exists, then, as is easily
obtained by differentiation, $k(w)$ must satisfy the relations:

$$k(-\infty) = k(+\infty) = 1,$$

$$[(1+e^{w})k(w)]' \ge 0,$$

$$[(1+e^{-w})k(w)]' \le 0$$

and

$$(1+e^{-w})k''(w) + (1-e^{-w})k'(w) \ge 0,$$

the corresponding conditions for the general case being, as $\Delta^2_{x,y}\Lambda(x,y) \ge 0$,

$$k(-\infty) = k(+\infty) = 1,$$

$(1+e^{w})k(w)$ a non-decreasing function,

$(1+e^{-w})k(w)$ a non-increasing function,

and $\Delta^2_{x,y}[(e^{-x}+e^{-y})k(y-x)] \le 0$.

Some other properties can be ascribed to $k(w)$. The first one is the symmetry
condition, i.e., if $k(w)$ is a dependence function, then $k(-w)$ is also a dependence
function. The proof is immediate if we consider the conditions in the
differentiable case (where a planar density does exist) and slightly longer in the
general case. If $k(w) = k(-w)$ then $(X,Y)$ is an exchangeable pair and
$\Lambda(x,y) = \Lambda(y,x)$.

Also it is immediate that if $k_1(w)$ and $k_2(w)$ are dependence functions, any
mixture $\theta k_1(w) + (1-\theta)k_2(w)$, $0 \le \theta \le 1$, is also a dependence function. The set
of dependence functions is, then, convex. And this convexity property

$$\Lambda(x,y) = \Lambda_1^{\theta}(x,y)\,\Lambda_2^{1-\theta}(x,y)$$

is very useful in obtaining models: the mixed model, as well as the Gumbel model,
are such examples.

Another method of generating models is the following. Let $(X,Y)$ be an
extreme random pair, with dependence function $k(w)$ and standard Gumbel margins,
and consider the new random pair $(\tilde X,\tilde Y)$ with $\tilde X = \max(X+a,Y+b)$,
$\tilde Y = \max(X+c,Y+d)$. To have standard Gumbel margins we must have
$(e^a+e^b)k(a-b) = 1$ and $(e^c+e^d)k(c-d) = 1$. Then we have

$$\tilde k(w) = \frac{[e^{\max(a+w,c)} + e^{\max(b+w,d)}]\;k[\max(a+w,c) - \max(b+w,d)]}{1+e^{w}}$$

with $(a,b)$ and $(c,d)$ satisfying the conditions written above. This max-technique
will be used towards the end of the paper to generate the biextremal and natural
models.
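As a check of the max-technique formula, the sketch below builds $\tilde k$ from a given $k$ and constants $(a,b,c,d)$, and applies it to an independent pair with $a = 0$, $b = -\infty$, $c = \log\theta$, $d = \log(1-\theta)$, which is the construction of the biextremal model used later in the paper. The helper names and the use of `-math.inf` to drop a component are assumptions of this illustration.

```python
import math

def max_technique(k, a, b, c, d):
    """Dependence function of (max(X+a, Y+b), max(X+c, Y+d)).

    The caller is responsible for (e^a + e^b) k(a-b) = 1 and
    (e^c + e^d) k(c-d) = 1, so that the new margins are standard Gumbel.
    """
    def k_new(w):
        u = max(a + w, c)
        v = max(b + w, d)
        return (math.exp(u) + math.exp(v)) * k(u - v) / (1.0 + math.exp(w))
    return k_new

def biextremal_k(theta):
    """Closed form (1 - theta + max(theta, e^w)) / (1 + e^w)."""
    return lambda w: (1.0 - theta + max(theta, math.exp(w))) / (1.0 + math.exp(w))

if __name__ == "__main__":
    theta = 0.3
    # Independent pair (k = 1); b = -inf means the first new variable is X itself.
    k_tilde = max_technique(lambda w: 1.0, 0.0, -math.inf,
                            math.log(theta), math.log(1.0 - theta))
    for w in (-3.0, -1.0, 0.0, 2.0):
        print(w, k_tilde(w), biextremal_k(theta)(w))
```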

Let us stress that independence has a very important position as a limiting
situation. If we denote by $P(a,b)$ the function defined by $\mathrm{Prob}\{X>x,Y>y\} =
P(F(x,+\infty), F(+\infty,y))$, Sibuya (1960) has shown that the necessary and sufficient
condition for having limiting independence is that $P(1-s,1-s)/s \to 0$ as $s \to 0$.
He also showed that the necessary and sufficient condition for having the diagonal
case as a limit situation is that $P(1-s,1-s)/s \to 1$ as $s \to 0$. With the first
result we can show, easily, that the maxima of the binormal distribution have
independence as a limiting distribution if $|\rho| < 1$.

Also Geffroy (1958/59) showed that a sufficient condition for limiting
independence is that

$$\frac{1 + F(x,y) - F(x,w_y) - F(w_x,y)}{1 - F(x,y)} \to 0 \quad \text{as}\ x \to w_x \ \text{and}\ y \to w_y,$$

$w_x$ and $w_y$ being the right end points of the support of $X$ and $Y$.

Sibuya conditions (and Geffroy sufficient conditions) are easy to interpret:
we have limiting independence if $\mathrm{Prob}\{X>x,Y>y\}$ is a vanishing summand of
$\mathrm{Prob}\{X>x \text{ or } Y>y\}$ and the diagonal case as limit if $\mathrm{Prob}\{X>x,Y>y\}$ is the leading
summand of $\mathrm{Prob}\{X>x \text{ or } Y>y\}$.

It is known that a random pair $(X,Y)$ with distribution function $F(x,y)$ has
positive association if

$$\mathrm{Prob}\{X \le x, Y \le y\} + \mathrm{Prob}\{X > x, Y > y\}$$

is larger than or equal to the corresponding probabilities in the case of
independence; intuitively this means that large (small) values of one of the
variables are associated with large (small) values of the other. It is immediate
that this reduces to

$$F(x,y) \ge F(x,+\infty)F(+\infty,y).$$

The inequality, obtained from the Boole-Fréchet inequality,

$$\Lambda(x)\Lambda(y) \le \Lambda(x,y) \le \min(\Lambda(x),\Lambda(y))$$

shows that this is the case for bivariate extreme pairs, as could be anticipated.
This result is due to Sibuya (1960).

The results on correlation that follow and some regression results to be given
later continue to illuminate the situation.

As the covariance between $X$ and $Y$ can be written, as is well known,

$$\mathrm{cov}(X,Y) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} [F(x,y) - F(x,+\infty)F(+\infty,y)]\,dx\,dy,$$

we have in our case

$$\mathrm{cov}(X,Y) = -\int_{-\infty}^{+\infty} \log k(w)\,dw$$

and the correlation coefficient

$$\rho = -\frac{6}{\pi^2}\int_{-\infty}^{+\infty} \log k(w)\,dw.$$

As $k(w) \le 1$ we have $0 \le \rho$, as could be expected from the positive association.
It is very easy to show that for the diagonal case

$$k_D(w) = \frac{\max(1,e^{w})}{1+e^{w}}$$

we have $\rho = 1$. Evidently the value of $\rho$ does not identify the dependence function
(or the distribution): $\rho$ is the same for $k(w)$ and $k(-w)$. But $\rho = 0$, as $k(w) \le 1$,
implies $k(w) = 1$, or independence. Writing now $\rho$ in the form

$$\rho = 1 - \frac{6}{\pi^2}\int_{-\infty}^{+\infty} \log\frac{k(w)}{k_D(w)}\,dw$$

we see, analogously, that $\rho = 1$, as $k(w) \ge k_D(w)$, implies $k(w) = k_D(w)$, that is, the
diagonal case.
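The correlation formula $\rho = -(6/\pi^2)\int \log k(w)\,dw$ is easy to evaluate numerically for any dependence function. The sketch below uses a plain trapezoidal rule on a truncated range; the function names, the range and the step are assumptions of this illustration.

```python
import math

def correlation(log_k, lo=-40.0, hi=40.0, step=0.005):
    """rho = -(6 / pi^2) * integral of log k(w) dw, by the trapezoidal rule.

    log_k(w) must return log k(w); [lo, hi] is assumed wide enough for the
    integrand to be negligible outside it.
    """
    n = int(round((hi - lo) / step))
    total = 0.5 * (log_k(lo) + log_k(hi))
    for i in range(1, n):
        total += log_k(lo + i * step)
    return -6.0 / math.pi ** 2 * total * step

def log_k_diagonal(w):
    """log of max(1, e^w) / (1 + e^w), i.e. -log(1 + e^-|w|)."""
    return -math.log1p(math.exp(-abs(w)))

def log_k_independence(w):
    """log k(w) for k(w) = 1."""
    return 0.0

if __name__ == "__main__":
    print(correlation(log_k_diagonal))      # close to 1
    print(correlation(log_k_independence))  # zero, since log k vanishes
```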

Other common correlation coefficients are the grade-correlation coefficient

$$\chi = 12\int_{-\infty}^{+\infty} \frac{e^{w}}{(1+e^{w})^2(1+k(w))^2}\,dw - 3$$

and the difference-sign correlation coefficient

$$\tau = 1 - \int_{-\infty}^{+\infty} D(w)(1-D(w))\,dw,$$

where

$$D(w) = \mathrm{Prob}\{Y - X \le w\} = \frac{1}{1+e^{-w}} + \frac{k'(w)}{k(w)};$$

the inverse relation between $D(w)$ and $k(w)$ is

$$k(w) = \frac{\exp\left(\int_{-\infty}^{w} D(t)\,dt\right)}{1+e^{w}}.$$

Recall that $k'(w)$ exists almost everywhere.

In the case of existence of a planar density, which means the existence of
$D'(w)$, the conditions on $k(w)$ are equivalent to the conditions that $D(w)$ is a
distribution function with mean value zero

$$\left(\int_{-\infty}^{0} D(w)\,dw = \int_{0}^{+\infty} (1-D(w))\,dw\right)$$

and such that

$$D'(w) \ge D(w)(1-D(w)).$$

Note that in the independence case ($k(w) = 1$) we have

$$D(w) = \frac{1}{1+e^{-w}} \quad \text{(the logistic distribution)}$$

and

$$D'(w) = D(w)(1-D(w)).$$
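The passage from $D(w)$ back to $k(w)$ through $k(w) = \exp(\int_{-\infty}^{w} D(t)\,dt)/(1+e^{w})$ can be sketched numerically. In the example, the lower limit $-\infty$ is replaced by a finite cut-off, which is an assumption of this illustration, as are the function names; the logistic $D$ should give back $k(w) = 1$.

```python
import math

def k_from_D(D, w, lo=-40.0, step=0.001):
    """k(w) = exp(integral_{-inf}^{w} D(t) dt) / (1 + e^w).

    The integral is truncated at `lo` (assumed far in the left tail of D)
    and computed by the trapezoidal rule.
    """
    n = max(1, int(round((w - lo) / step)))
    h = (w - lo) / n
    integral = 0.5 * (D(lo) + D(w))
    for i in range(1, n):
        integral += D(lo + i * h)
    integral *= h
    return math.exp(integral) / (1.0 + math.exp(w))

def D_logistic(w):
    """Distribution function of Y - X under independence."""
    return 1.0 / (1.0 + math.exp(-w))

if __name__ == "__main__":
    for w in (-3.0, 0.0, 2.0):
        print(w, k_from_D(D_logistic, w))  # each value should be close to 1
```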


The differentiable models; statistical decisions: In this section we will suppose
that the assumed models have standard Gumbel margins (or, equivalently, that the
location and dispersion margin parameters are known).

Up to now only two differentiable models - i.e., models with planar density,
with a point exception in one case - are known. They appear in Gumbel (1961).

One is the logistic model, so called because the distribution function of the
reduced difference $W = Y - X$ is

$$D_\theta(w) = (1+e^{-w/(1-\theta)})^{-1}$$

corresponding to the dependence function

$$k_\theta(w) = \frac{(1+e^{-w/(1-\theta)})^{1-\theta}}{1+e^{-w}}.$$

For $\theta = 0$ we have independence ($k_0(w) = 1$) and for $\theta = 1$ we have the diagonal
case

$$k_1(w) = \frac{\max(1,e^{w})}{1+e^{w}},$$

which is the only case where we do not have a planar density.

As $k_\theta(w) = k_\theta(-w)$ the margins are exchangeable, as can also be shown by the
form of the distribution function

$$\Lambda_\theta(x,y) = \exp\{-(e^{-x/(1-\theta)} + e^{-y/(1-\theta)})^{1-\theta}\}.$$

The correlation coefficient has the expression $\rho(\theta) = \theta(2-\theta)$, which increases from
$\rho(0) = 0$ to $\rho(1) = 1$; Kendall's $\tau$ has the expression $\tau(\theta) = \theta$.

It can be shown that

$$\sup_{x,y}|\Lambda_\theta(x,y) - \Lambda_0(x,y)| = (1-2^{-\theta})\,2^{-\theta/(2^{\theta}-1)}$$

which increases from 0 (at $\theta = 0$) to 1/4 (at $\theta = 1$). It is, then, intuitive that
the distance between independence ($\theta = 0$) and the assumed model for $\theta > 0$
is small in general, and for small samples it will probably be impossible to
distinguish them. It is thus natural to use a one-sided test of $\theta = 0$ vs. $\theta > 0$.
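The closed forms quoted for the logistic model can be compared with direct numerical evaluation. This is a sketch under stated assumptions: the correlation is obtained from $\rho = -(6/\pi^2)\int\log k_\theta(w)\,dw$ on a truncated range, and the supremum is searched only along the diagonal $x = y$ (where the maximum is assumed to be attained); all names are illustrative.

```python
import math

def logistic_log_k(w, theta):
    """log k_theta(w) for the logistic model, 0 <= theta < 1."""
    s = 1.0 - theta
    return s * math.log1p(math.exp(-w / s)) - math.log1p(math.exp(-w))

def logistic_cdf(x, y, theta):
    """Lambda_theta(x, y) = exp(-(e^{-x/(1-theta)} + e^{-y/(1-theta)})^(1-theta))."""
    s = 1.0 - theta
    return math.exp(-(math.exp(-x / s) + math.exp(-y / s)) ** s)

def rho_numeric(theta, lo=-30.0, hi=30.0, step=0.005):
    """Trapezoidal evaluation of -(6/pi^2) * integral of log k_theta."""
    n = int(round((hi - lo) / step))
    total = 0.5 * (logistic_log_k(lo, theta) + logistic_log_k(hi, theta))
    for i in range(1, n):
        total += logistic_log_k(lo + i * step, theta)
    return -6.0 / math.pi ** 2 * total * step

def sup_distance(theta):
    """Closed form (1 - 2^-theta) * 2^(-theta / (2^theta - 1)), theta > 0."""
    return (1.0 - 2.0 ** (-theta)) * 2.0 ** (-theta / (2.0 ** theta - 1.0))

def sup_distance_on_diagonal(theta, lo=-3.0, hi=6.0, step=0.001):
    """Grid search of Lambda_theta(x, x) - Lambda_0(x, x)."""
    n = int(round((hi - lo) / step))
    return max(logistic_cdf(lo + i * step, lo + i * step, theta)
               - math.exp(-2.0 * math.exp(-(lo + i * step))) for i in range(n + 1))

if __name__ == "__main__":
    theta = 0.5
    print(rho_numeric(theta), theta * (2.0 - theta))
    print(sup_distance_on_diagonal(theta), sup_distance(theta))
```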

Denoting by $p_\theta(x,y)$ the density $\partial^2\Lambda_\theta/\partial x\,\partial y$ ($p_0(x,y) = \Lambda'(x)\Lambda'(y)$) it is well
known that the locally most powerful test of $\theta = 0$ vs. $\theta > 0$ is given by the
critical region

$$\sum_{i=1}^{n} v(x_i,y_i) \ge a_n$$

where

$$v(x,y) = \left.\frac{\partial}{\partial\theta}\log p_\theta(x,y)\right|_{\theta=0}.$$

In our case we get

$$v(x,y) = -x - y + xe^{-x} + ye^{-y} + (e^{-x}+e^{-y}-2)\log(e^{-x}+e^{-y}) + \frac{1}{e^{-x}+e^{-y}}$$

whose mean value is zero but which has infinite variance at $\theta = 0$. Thus the usual
central limit theorem is not applicable and we have to resort to simulation.
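The score $v(x,y)$ and the statistic $\sum v(x_i,y_i)$ are straightforward to code. As a sanity check, the sketch also includes a log-density of the logistic model, obtained here by differentiating $\Lambda_\theta$ twice (this expression is derived for the illustration and is not quoted from the paper), so that $v$ can be compared with a finite-difference derivative at $\theta = 0$.

```python
import math

def score_logistic(x, y):
    """v(x, y): derivative of log p_theta(x, y) with respect to theta at theta = 0."""
    ex, ey = math.exp(-x), math.exp(-y)
    s = ex + ey
    return -x - y + x * ex + y * ey + (s - 2.0) * math.log(s) + 1.0 / s

def lmp_statistic(pairs):
    """Sum of scores; large values speak against theta = 0."""
    return sum(score_logistic(x, y) for x, y in pairs)

def log_density_logistic(x, y, theta):
    """log of d^2 Lambda_theta / dx dy (derived for this sketch), 0 <= theta < 1."""
    s = 1.0 - theta
    a = math.exp(-x / s) + math.exp(-y / s)
    return (-a ** s - (x + y) / s + (s - 2.0) * math.log(a)
            + math.log(a ** s + (1.0 - s) / s))

if __name__ == "__main__":
    x, y, h = 0.3, -0.2, 1e-6
    finite_diff = (log_density_logistic(x, y, h) - log_density_logistic(x, y, 0.0)) / h
    print(score_logistic(x, y), finite_diff)
```

The critical value $a_n$ is not computed here; since the variance of $v$ is infinite, it would have to come from simulation, as the text says.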

A similar situation occurs for the mixed model, whose dependence function is
the $(1-\theta,\theta)$ mixture of the dependence functions $k_0(w) = 1$ (independence) and

$$k_1(w) = 1 - \frac{e^{w}}{(1+e^{w})^2}, \quad \text{for which} \quad k_\theta(w) = 1 - \theta\,\frac{e^{w}}{(1+e^{w})^2}.$$

In this case we always have a planar density. The distribution function of the
difference $W = Y - X$ is

$$D_\theta(w) = \frac{e^{w}}{1+e^{w}}\cdot\frac{(1+e^{w})^2 - \theta}{(1+e^{w})^2 - \theta e^{w}}.$$

For $\theta = 0$ we have independence, as noted, but for $\theta = 1$ we have dependence but
not the diagonal case. The Boole-Fréchet inequality shows that the domain $[0,1]$
for $\theta$ cannot be enlarged.

The distribution function is

$$\Lambda_\theta(x,y) = \exp\left\{-\left(e^{-x}+e^{-y}-\frac{\theta}{e^{x}+e^{y}}\right)\right\} = \Lambda(x)\Lambda(y)\exp\left(\frac{\theta}{e^{x}+e^{y}}\right)$$

and the pair is exchangeable, as can be seen, also, because $k_\theta(w) = k_\theta(-w)$. The
correlation coefficient is

$$\rho(\theta) = \frac{6}{\pi^2}\left(\arccos(1-\theta/2)\right)^2,$$

increasing from $\rho(0) = 0$ to $\rho(1) = 2/3$. We have also

$$\sup_{x,y}|\Lambda_\theta(x,y) - \Lambda_0(x,y)| = \frac{\theta}{4}\left(1-\frac{\theta}{4}\right)^{4/\theta - 1}$$

which increases from 0 (at $\theta = 0$) to $3^3/4^4 = 0.106$ (at $\theta = 1$). The smaller
variation of the correlation coefficient and of the distance shows that the
deviation from independence is smaller and more difficult to detect.
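A short numerical cross-check of the mixed-model formulas, as a sketch only: the correlation is recomputed from $\rho = -(6/\pi^2)\int\log k_\theta(w)\,dw$ on a truncated range and compared with the arccos expression, and the distance formula is evaluated at $\theta = 1$. Function names and integration settings are illustrative assumptions.

```python
import math

def mixed_log_k(w, theta):
    """log(1 - theta e^w / (1+e^w)^2), using e^w/(1+e^w)^2 = 1/(4 cosh^2(w/2))."""
    return math.log1p(-theta / (4.0 * math.cosh(w / 2.0) ** 2))

def rho_closed_form(theta):
    """(6 / pi^2) * arccos(1 - theta/2)^2."""
    return 6.0 / math.pi ** 2 * math.acos(1.0 - theta / 2.0) ** 2

def rho_numeric(theta, lo=-40.0, hi=40.0, step=0.005):
    """Trapezoidal evaluation of -(6/pi^2) * integral of log k_theta."""
    n = int(round((hi - lo) / step))
    total = 0.5 * (mixed_log_k(lo, theta) + mixed_log_k(hi, theta))
    for i in range(1, n):
        total += mixed_log_k(lo + i * step, theta)
    return -6.0 / math.pi ** 2 * total * step

def sup_distance(theta):
    """(theta/4) * (1 - theta/4)^(4/theta - 1), for 0 < theta <= 1."""
    return theta / 4.0 * (1.0 - theta / 4.0) ** (4.0 / theta - 1.0)

if __name__ == "__main__":
    for theta in (0.25, 0.5, 1.0):
        print(theta, rho_numeric(theta), rho_closed_form(theta), sup_distance(theta))
```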

Once more, the locally most powerful test of $\theta = 0$ vs. $\theta > 0$ leads to the
critical region

$$\sum_{i=1}^{n} v(x_i,y_i) \ge a_n.$$

The mean value is zero but the variance is also infinite. Those difficulties
show that we must apply the usual methods of data analysis, although inefficient.

The combination of the use of correlation coefficients (product-moment,
difference-sign and grade) and also the step and quadrants method described
briefly in Tiago de Oliveira (1975, 1980) shows that the most efficient of all is
the (product-moment) correlation coefficient, for testing $\theta = 0$ vs. $\theta > 0$ in both
models. Naturally, until further advances appear, it seems natural to use
correlation coefficients to estimate $\theta$. Note that all those methods are independent
of the margin parameters. For confidence intervals, owing to the difficulty of
getting the variance of $\rho$ as a function of $\theta$ (in both models) it may, even, be
useful to use the quadrants method, which estimates the probability of the
components of the random pair being both larger or both smaller than the medians of
the margins, by its observed frequency. As the margin medians, in reduced form, are
$\tilde\mu = -\log\log 2$, the probability already referred to has the expression
$2\Lambda_\theta(\tilde\mu,\tilde\mu)$, which amounts to $p(\theta) = \exp(\log 2 \times (1-2^{1-\theta}))$ in the logistic model
and to $p(\theta) = \tfrac12 \times 2^{\theta/2}$ for the mixed model. $\theta$ is estimated by $p(\theta^*) = N/n$ where $N$ is the
number of observed pairs whose components are both smaller or both larger than
the  sample  margin  medians  and  it  is  known  that 


is  asymptotically  normal. 
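The quadrants estimate amounts to inverting $p(\theta)$ at the observed frequency $N/n$. The sketch below does this inversion for the two models; the function names and the clipping of the result to $[0,1]$ are assumptions of this illustration, not prescriptions of the paper.

```python
import math

def p_logistic(theta):
    """p(theta) = exp(log 2 * (1 - 2^(1-theta))) = 2^(1 - 2^(1-theta))."""
    return 2.0 ** (1.0 - 2.0 ** (1.0 - theta))

def p_mixed(theta):
    """p(theta) = (1/2) * 2^(theta/2)."""
    return 0.5 * 2.0 ** (theta / 2.0)

def theta_from_quadrants(count, n, model):
    """Solve p(theta) = count / n; the result is clipped to [0, 1] (assumption)."""
    p = count / n
    if p <= 0.5:               # at or below the independence value p(0) = 1/2
        return 0.0
    if model == "logistic":
        est = 1.0 - math.log2(1.0 - math.log2(p))
    elif model == "mixed":
        est = 2.0 * (1.0 + math.log2(p))
    else:
        raise ValueError("model must be 'logistic' or 'mixed'")
    return min(1.0, max(0.0, est))

if __name__ == "__main__":
    print(theta_from_quadrants(620, 1000, "logistic"))
    print(theta_from_quadrants(620, 1000, "mixed"))
```

Here `count` plays the role of $N$, the number of pairs with both components below or both above the sample margin medians.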

No other statistical decision problems (such as regression analysis,
discrimination and forecasting) have been dealt with for both models; as said,
separation of the two models is now under study.


The non-differentiable models; statistical decisions: The biextremal and the
Gumbel models were the only ones considered until recently; now there is a third
model, the natural, which, in some way, generalizes the biextremal model (see
Tiago de Oliveira (1970, 1971, 1974, 1975, 1975', 1980)).

The biextremal model appears naturally in extremal processes (see Tiago de
Oliveira (1968) and references therein). One way of introducing it directly
through the max-technique is to consider a standard Gumbel independent pair
$(X,Z)$ and take the new pair $(X,Y)$ with

$$Y = \max(X + \log\theta,\ Z + \log(1-\theta)), \quad 0 \le \theta \le 1.$$

It has the distribution function

$$\Lambda_\theta(x,y) = \exp\{-\max(e^{-x}+(1-\theta)e^{-y},\ e^{-y})\}$$

and the dependence function

$$k_\theta(w) = \frac{1-\theta+\max(\theta,e^{w})}{1+e^{w}} = 1 - \frac{\min(\theta,e^{w})}{1+e^{w}}.$$

As $k_\theta(w) \ne k_\theta(-w)$ the random variables are not exchangeable.

The distribution function of the (reduced) difference is $D_\theta(w) = 0$ if $w < \log\theta$
and $D_\theta(w) = (1+(1-\theta)e^{-w})^{-1}$ if $w \ge \log\theta$, with a jump of $\theta$ at $\log\theta$. It is immediate
that

$$\mathrm{Prob}\{Y \ge X + \log\theta\} = 1,$$

and, so, a singular part is concentrated on the line $y = x + \log\theta$, with
probability $\theta$. For $\theta = 0$ and $\theta = 1$ we obtain independence and the diagonal case. We
have

$$\sup_{x,y}|\Lambda_\theta(x,y) - \Lambda_0(x,y)| = \theta\,(1+\theta)^{-(1+\theta)/\theta}$$

which, as for the logistic model, increases from 0 (at $\theta = 0$) to 1/4 (at $\theta = 1$).

As $\mathrm{Prob}\{Y - X \ge \log\theta\} = 1$, and so $\mathrm{Prob}\{\min_i(Y_i - X_i) \ge \log\theta\} = 1$,
it is natural to estimate $\theta$ by

$$\theta^* = \min\left(e^{\min_i(y_i - x_i)},\ 1\right)$$

whose distribution function is $\mathrm{Prob}\{\theta^* \le z\} = 0$ if $z < \theta$,
$= 1 - \left(\frac{1-\theta}{1-\theta+z}\right)^{n}$ if $\theta \le z < 1$, $= 1$ if $z \ge 1$, the variance of $\theta^*$ being
asymptotic to $2(1-\theta)^n/n^2$ if $\theta > 0$ and asymptotic to $1/n^2$ if $\theta = 0$. A natural
test of independence, at significance level $\alpha$, using $\theta^*$ is to accept $\theta = 0$ vs.
$\theta > 0$ if $\theta^* \le \alpha^{-1/n} - 1$. The correlation coefficient of the biextremal model is

$$\rho(\theta) = -\frac{6}{\pi^2}\int_{0}^{\theta} \frac{\log t}{1-t}\,dt,$$

increasing from $\rho(0) = 0$ to $\rho(1) = 1$.
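The construction $Y = \max(X+\log\theta, Z+\log(1-\theta))$ gives a direct way to simulate the biextremal model and to try the estimator $\theta^*$ and the test of independence. The following is a minimal sketch for $0 < \theta < 1$; the helper names and the inversion sampler for the Gumbel law are choices made for this illustration.

```python
import math
import random

def standard_gumbel(rng):
    """Inversion sampling: -log(-log U) with U uniform on (0, 1)."""
    u = rng.random()
    while u <= 0.0:          # random() can return 0.0; reject it
        u = rng.random()
    return -math.log(-math.log(u))

def sample_biextremal(theta, n, rng):
    """n pairs (X, Y) with Y = max(X + log theta, Z + log(1 - theta))."""
    pairs = []
    for _ in range(n):
        x = standard_gumbel(rng)
        z = standard_gumbel(rng)
        pairs.append((x, max(x + math.log(theta), z + math.log(1.0 - theta))))
    return pairs

def theta_star(pairs):
    """theta* = min(exp(min_i (y_i - x_i)), 1)."""
    return min(math.exp(min(y - x for x, y in pairs)), 1.0)

def accept_independence(pairs, alpha):
    """Accept theta = 0 when theta* <= alpha^(-1/n) - 1."""
    n = len(pairs)
    return theta_star(pairs) <= alpha ** (-1.0 / n) - 1.0

if __name__ == "__main__":
    rng = random.Random(1)
    data = sample_biextremal(0.3, 500, rng)
    print(theta_star(data), accept_independence(data, 0.05))
```

Since a fraction $\theta$ of the points lies exactly on the line $y = x + \log\theta$, $\theta^*$ equals $\theta$ (up to rounding) as soon as one such point is in the sample.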

In Tiago de Oliveira (1974) we gave the expression of the general regression
of Y on X and X on Y. It was shown already, in regard to mean square error, that
linear regression is a good approximation to the general regression. Although
linear regression and general regression curves behave very differently for very
large and very small values of x (or y), this can be explained because of the
positive association and of the fact that the half-lines where they are very
distinct have a very low probability and, thus, a very small weight in the mean
square error.

The Gumbel model has the distribution function

$$\Lambda_\theta(x,y) = \exp\{-[e^{-x}+e^{-y}-\theta\min(e^{-x},e^{-y})]\}$$

with the dependence function

$$k_\theta(w) = 1 - \theta\,\frac{\min(1,e^{w})}{1+e^{w}}$$

and as $k_\theta(w) = k_\theta(-w)$ the random variables $X$ and $Y$ are exchangeable. The
distribution function of the (reduced) difference is

$$D_\theta(w) = \frac{1-\theta}{1-\theta+e^{-w}} \ \text{if}\ w < 0, \qquad = \frac{e^{w}}{1-\theta+e^{w}} \ \text{if}\ w \ge 0,$$

with a jump of $\theta/(2-\theta)$ at $w = 0$, giving thus the probability $P(Y=X)$. The
dependence function is the mixture $(1-\theta,\theta)$ of independence ($k_0(w) = 1$) and the
diagonal case ($k_1(w) = \max(1,e^{w})/(1+e^{w})$). For this model

$$\sup_{x,y}|\Lambda_\theta(x,y) - \Lambda_0(x,y)| = \frac{\theta}{2}\left(1-\frac{\theta}{2}\right)^{2/\theta - 1}$$

which, also, increases from 0 (at $\theta = 0$) to 1/4 (at $\theta = 1$).

This model is a transformation for Gumbel margins of the Marshall and Olkin
(1967) bivariate exponential model. If we denote by $nf_n$ the number of points
$(x_i,y_i)$ with $x_i = y_i$ and by $T_n = \frac{1}{n}\sum_{i=1}^{n}\min(e^{-x_i},e^{-y_i})$, the maximum likelihood
estimator is then given by

$$\theta^* = \frac{T_n - 1 + \sqrt{(T_n-1)^2 + 4f_nT_n}}{2T_n},$$

taking $\hat\theta = 0$ if the expression is negative. $\theta^*$, not truncated, is asymptotically
normal with mean value $\theta$ and variance

$$\frac{\theta(1-\theta)(2-\theta)}{n(1+\theta)}.$$

Note that $f_n$ and $T_n$ are asymptotically independent and that

$$\left(\sqrt{n}\,\frac{(2-\theta)f_n - \theta}{\sqrt{2\theta(1-\theta)}},\ \sqrt{n}\,((2-\theta)T_n - 1)\right)$$

is asymptotically a binormal pair with standard margins. In particular we see
that

$$\mathrm{var}(f_n) \approx \frac{2}{n}\,\frac{\theta(1-\theta)}{(2-\theta)^2}$$

is zero at $\theta = 0$, as $(X,Y)$, being independent, form an absolutely continuous pair,
and at $\theta = 1$, as $P(X=Y) = 1$. The estimator

$$\theta^* = \frac{T_n - 1 + \sqrt{(T_n-1)^2 + 4f_nT_n}}{2T_n}$$

is asymptotically normal with mean value $\theta$ and variance $\theta(1-\theta)(2-\theta)/((1+\theta)n)$;
evidently if $\theta^* < 0$ or $\theta^* > 1$ we must truncate.

To test independence ($\theta = 0$), as $f_n = 0$ and the variance of $\hat\theta$ is null, we
can use $T_n$, $\sqrt{n}(2T_n - 1)$ being asymptotically standard normal.
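The estimator built from the tie frequency $f_n$ and from $T_n$ can be sketched as follows. Two points are assumptions of this illustration: $T_n$ is taken as the sample mean of $\min(e^{-x_i}, e^{-y_i})$, the choice under which $(2-\theta)T_n$ has mean 1 as the asymptotic statement requires, and the result is truncated to $[0,1]$. The function names are hypothetical.

```python
import math

def gumbel_model_statistics(pairs):
    """Return (f_n, T_n): tie frequency and mean of min(e^-x, e^-y) (assumed form)."""
    n = len(pairs)
    f_n = sum(1 for x, y in pairs if x == y) / n
    t_n = sum(min(math.exp(-x), math.exp(-y)) for x, y in pairs) / n
    return f_n, t_n

def theta_estimate(f_n, t_n):
    """(T - 1 + sqrt((T - 1)^2 + 4 f T)) / (2 T), truncated to [0, 1]."""
    raw = (t_n - 1.0 + math.sqrt((t_n - 1.0) ** 2 + 4.0 * f_n * t_n)) / (2.0 * t_n)
    return min(1.0, max(0.0, raw))

def independence_statistic(t_n, n):
    """sqrt(n) (2 T_n - 1), approximately standard normal when theta = 0."""
    return math.sqrt(n) * (2.0 * t_n - 1.0)

if __name__ == "__main__":
    theta = 0.4
    # Population values: P(X = Y) = theta / (2 - theta), E[T_n] = 1 / (2 - theta).
    print(theta_estimate(theta / (2.0 - theta), 1.0 / (2.0 - theta)))
```

Plugging the population values of $f_n$ and $T_n$ into the formula returns $\theta$, which is a quick consistency check of the expression.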

The correlation coefficient is

$$\rho(\theta) = \frac{12}{\pi^2}\int_{0}^{\theta} \frac{\log(2-t)}{1-t}\,dt$$

increasing from $\rho(0) = 0$ to $\rho(1) = 1$. As for the biextremal model the linear
regression is a very good approximation to the general one, in the same sense
as before (see Tiago de Oliveira (1974)).

Let us consider the natural model described in Tiago de Oliveira (1982). If
we take independent (reduced) Gumbel random variables $Z$ and $T$ and consider a new
random pair $(X,Y)$ with $X = \max(Z-a,T-b)$, $Y = \max(Z-c,T-d)$, with $a, b, c, d \ge 0$
such that the margins are standard, we get $e^{-a}+e^{-b} = e^{-c}+e^{-d} = 1$. Then we have

$$\Lambda(x,y) = P(X \le x, Y \le y) = \exp\{-(e^{-x}+e^{-y})k(y-x)\},$$

$$k(w) = \frac{\max(e^{-a},e^{-c-w}) + \max(e^{-b},e^{-d-w})}{1+e^{-w}},$$

if $a - c \le b - d$, using all the introduced parameters. The random points, as
$a - c \le y - x \le b - d$, are contained in a strip parallel to the first diagonal,
imposing thus a strong stochastic relation between $X$ and $Y$ if the bounds $a - c$
and $b - d$ are not infinite. As $a - c \le 0 \le b - d$ this strip contains the origin.
Let us now introduce the parameters $\alpha, \beta \ge 0$, such that
$a - c = a + \log(1-e^{-d}) = -\alpha$ and $b - d = -d - \log(1-e^{-a}) = \beta$. Note that $\alpha > 0$ or
$\beta > 0$ imply $e^{\alpha+\beta} > 1$.

The final expression of the dependence function is

$$k_{\alpha,\beta}(w) = \frac{1}{1+e^{w}} \quad \text{if}\ w < -\alpha,$$

$$= \frac{1-e^{-\beta}+(1-e^{-\alpha})e^{-w}}{(1-e^{-\alpha-\beta})(1+e^{-w})} \quad \text{if}\ -\alpha \le w \le \beta,$$

$$= \frac{1}{1+e^{-w}} \quad \text{if}\ w > \beta.$$

Note that the left and right tails of $k_{\alpha,\beta}(w)$ coincide with the ones of the
diagonal case. As said before the case $\alpha = \beta = +\infty$ corresponds to independence and
$\alpha = \beta = 0$ to the diagonal case. The exchange of $\alpha$ and $\beta$ corresponds to the
exchange of $X$ and $Y$. For $\alpha = -\log\theta$ and $\beta = +\infty$ ($0 \le \theta \le 1$) we get the biextremal
model and its dual (exchange of $X$ and $Y$) is obtained for $\alpha = +\infty$, $\beta = -\log\theta$.
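The three-piece dependence function of the natural model and its special cases can be checked with a few lines of code. This is a sketch assuming $\alpha + \beta > 0$ (the diagonal case $\alpha = \beta = 0$ is excluded to avoid a zero denominator); infinite parameters are passed as `math.inf`, and the function names are illustrative.

```python
import math

def natural_k(w, alpha, beta):
    """k_{alpha,beta}(w) of the natural model, for alpha, beta >= 0, alpha + beta > 0."""
    if w < -alpha:
        return 1.0 / (1.0 + math.exp(w))
    if w > beta:
        return 1.0 / (1.0 + math.exp(-w))
    num = 1.0 - math.exp(-beta) + (1.0 - math.exp(-alpha)) * math.exp(-w)
    den = (1.0 - math.exp(-alpha - beta)) * (1.0 + math.exp(-w))
    return num / den

def biextremal_k(w, theta):
    """1 - min(theta, e^w) / (1 + e^w)."""
    return 1.0 - min(theta, math.exp(w)) / (1.0 + math.exp(w))

if __name__ == "__main__":
    theta = 0.3
    for w in (-3.0, -1.0, 0.0, 2.0):
        # alpha = -log(theta), beta = +inf should reproduce the biextremal model.
        print(w, natural_k(w, -math.log(theta), math.inf), biextremal_k(w, theta))
```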

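It is immediately shown that

$$D_{\alpha,\beta}(w) = \mathrm{Prob}\{Y - X \le w\} = 0 \quad \text{if}\ w < -\alpha,$$

$$= \frac{1-e^{-\beta}}{1-e^{-\beta}+(1-e^{-\alpha})e^{-w}} \quad \text{if}\ -\alpha \le w < \beta,$$

$$= 1 \quad \text{if}\ w \ge \beta,$$

with jumps of $\dfrac{e^{\beta}-1}{e^{\alpha+\beta}-1}$ at $-\alpha$ and $\dfrac{e^{\alpha}-1}{e^{\alpha+\beta}-1}$ at $\beta$.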

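The correlation coefficient has the expression

$$\rho(\alpha,\beta) = -\frac{6}{\pi^2}\int_{-\infty}^{+\infty} \log k_{\alpha,\beta}(w)\,dw$$

and as

$$1 = -\frac{6}{\pi^2}\int_{-\infty}^{+\infty} \log\frac{\max(1,e^{w})}{1+e^{w}}\,dw$$

we get by subtracting and simple algebra

$$\rho(\alpha,\beta) = 1 - \frac{6}{\pi^2}\int_{-\alpha}^{\beta} \log\frac{1-e^{-\beta}+(1-e^{-\alpha})e^{-w}}{(1-e^{-\alpha-\beta})\max(1,e^{-w})}\,dw$$

$$= 1 - \frac{6}{\pi^2}\left\{-\frac{\alpha^2}{2} + (\alpha+\beta)\log\frac{1-e^{-\beta}}{1-e^{-\alpha-\beta}} + \int_{(1-e^{-\alpha})/(e^{\beta}-1)}^{(e^{\alpha}-1)/(1-e^{-\beta})} \frac{\log(1+t)}{t}\,dt\right\}.$$

It is evident that $\rho(\alpha,\beta) \to 1$ as $\alpha, \beta \to 0$ and $\rho(\alpha,\beta) \to 0$ as $\alpha, \beta \to +\infty$.

The linear regression with reduced margins is given by the straight lines
$y - \gamma = \rho(\alpha,\beta)(x-\gamma)$ and $x - \gamma = \rho(\alpha,\beta)(y-\gamma)$.

For the general regression, as the interchange of $X$ and $Y$ is equivalent to the
interchange of $\alpha$ and $\beta$ it is sufficient to compute the regression curve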

y,*,b(x)  = x  +  s  - 


1  e'a  x  r[e&(e°‘-l)/(ee-e"  *)]e'x  -t 

- expC— ~ -  e  )  x  -a,  .  -  6  -at,,  -x  — r— 

l  e-o-6  r  e3_e~ot  J  [(1-e  )/(e  -e  )]e  t 


1  -  e 


Analogously to the situation for the biextremal model we can expect that
regression lines are a good approximation - in the same sense - to the general
regression curve. When the margins are reduced, as $-\alpha$ and $\beta$ are the bounds of
the support of $D_{\alpha,\beta}(w)$ we can naturally estimate $\alpha$ and $\beta$ from the $w_i = y_i - x_i$.
As $-\alpha \le \min(w_i) \le \max(w_i) \le \beta$ and $-\alpha \le 0 \le \beta$ the estimators of
$\alpha$ and $\beta$ - although biased (both in a one-sided manner) - are

$$\alpha^* = -\min(0, \min(w_i)) \quad (\le \alpha)$$

$$\beta^* = \max(0, \max(w_i)) \quad (\le \beta).$$

As $\mathrm{Prob}\{\alpha^* = 0\} = (1 - D_{\alpha,\beta}(0))^n$ and $\mathrm{Prob}\{\beta^* = 0\} = D_{\alpha,\beta}(0)^n$ and

$$0 < D_{\alpha,\beta}(0) = \frac{1-e^{-\beta}}{2-e^{-\alpha}-e^{-\beta}} < 1 \quad \text{if}\ \alpha, \beta > 0$$

we see that the probabilities converge to zero and, thus, the estimators are
consistent. If $\alpha = \beta = 0$ we have the diagonal case and all the sample points are
on the first diagonal.
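The estimators $\alpha^*$ and $\beta^*$ only need the extreme values of the differences $w_i = y_i - x_i$. A minimal sketch, assuming reduced (standard Gumbel) margins and using a hypothetical function name:

```python
def strip_estimates(pairs):
    """Return (alpha*, beta*) from pairs (x_i, y_i) with reduced margins.

    alpha* = -min(0, min w_i) and beta* = max(0, max w_i), with w_i = y_i - x_i.
    Both are one-sided: alpha* <= alpha and beta* <= beta.
    """
    diffs = [y - x for x, y in pairs]
    alpha_star = -min(0.0, min(diffs))
    beta_star = max(0.0, max(diffs))
    return alpha_star, beta_star

if __name__ == "__main__":
    sample = [(0.0, 0.5), (1.0, 0.2), (2.0, 2.1)]
    print(strip_estimates(sample))
```

When all differences have the same sign one of the two estimates is zero, which is the event whose probability is shown above to vanish as $n$ grows.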


Remarks on the non-parametric estimation of the dependence function: The fact
that the set of dependence functions $\{k(w)\}$ is a convex set could suggest
estimating the dependence function of the data under consideration by an average
$k^*(w) = \frac{1}{n}\sum_{i=1}^{n} k_s(w/w_i)$, where $k_s(w/w_i)$ is a special and convenient dependence
function and $w_i$ is the difference $y_i - x_i$ for the observed pair $(x_i,y_i)$, evidently
with standard margins. By Khintchine's theorem we know that $k^*(w)$ converges
almost surely, for every $w$, to

$$\int_{-\infty}^{+\infty} k_s(w/t)\,dD(t)$$

which should be equal to

$$k(w) = \frac{\exp\left(\int_{-\infty}^{w} D(t)\,dt\right)}{1+e^{w}}.$$

The resulting integral equation

$$(1+e^{w})\int_{-\infty}^{+\infty} k_s(w/t)\,dD(t) = \exp\left(\int_{-\infty}^{w} D(t)\,dt\right)$$

does not seem to be solvable for $k_s(w/v)$, for all admissible $\{D(w)\}$. This
temptation must be discarded.

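We can try to estimate $D(w)$ directly by the usual sample distribution function
$\frac{1}{n}\sum_{i=1}^{n} H(w-w_i)$ where $H(w)$ is the Heaviside jump function at $w = 0$ ($H(w) = 0$ if
$w < 0$, $H(w) = 1$ if $w \ge 0$). It is easy to see that we get, then,

$$k^*(w) = \frac{\exp\left(\frac{1}{n}\sum_{i=1}^{n}(w-w_i)_+\right)}{1+e^{w}}$$

where $(w)_+ = 0$ if $w < 0$ and $(w)_+ = w$ if $w \ge 0$. But we can see that although
$k^*(-\infty) = 1$ we have $k^*(+\infty) = e^{-\bar w}$, where $\bar w$ is the sample mean of the $w_i$. A
possible modification is to take

$$k^{**}(w) = \frac{\exp\left(\frac{1}{n}\sum_{i=1}^{n}(w-w_i+\bar w)_+\right)}{1+e^{w}}$$

which converges a.s. to

$$\frac{\exp\left(\int_{-\infty}^{w} D(t)\,dt\right)}{1+e^{w}}$$

as $n \to \infty$, because $\bar w \to 0$. We have, already, then $k^{**}(-\infty) = k^{**}(+\infty) = 1$
but $k^{**}$ is not yet a dependence function.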
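The two empirical versions of the dependence function can be written down directly. This is only a sketch: it assumes that the modified estimator $k^{**}$ is obtained by centering the differences at their sample mean, which is the reading consistent with the limits $k^{**}(\pm\infty) = 1$ stated in the text, and the function names are hypothetical.

```python
import math

def _log1pexp(w):
    """log(1 + e^w), computed without overflow for large |w|."""
    return w + math.log1p(math.exp(-w)) if w > 0 else math.log1p(math.exp(w))

def k_star(w, diffs):
    """exp((1/n) sum (w - w_i)_+) / (1 + e^w), evaluated on the log scale."""
    s = sum(max(w - wi, 0.0) for wi in diffs) / len(diffs)
    return math.exp(s - _log1pexp(w))

def k_double_star(w, diffs):
    """Same as k_star after centering the differences at their mean (assumption)."""
    mean = sum(diffs) / len(diffs)
    return k_star(w, [wi - mean for wi in diffs])

if __name__ == "__main__":
    diffs = [-1.0, 0.5, 2.0]          # sample mean 0.5
    print(k_star(60.0, diffs), math.exp(-0.5))   # right tail tends to e^{-mean}
    print(k_double_star(60.0, diffs))            # right tail tends to 1
```

Neither function is guaranteed to satisfy all the conditions of a dependence function; the code only reproduces the tail behavior discussed above.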

We  could,  owing  to  the  central  position  of  the  logistic  distribution,  as 
associated  with  independence,  and  also  due  to  its  quite  good  behavior,  try  to 


estimate  D(w)  by 


D*  (w) 


_  1  rn 
=  n  ^1 


with  6  ■>  0.  In  fact,  we  do  not  obtain  a  D(w)  function  and,  so,  the  simpler 

estimator,  up  to  now,  is  k**(w). 

The  area  of  non-parametric  estimation  of  k(w)  or  D(w)  by  k-  or  D-  functions 
seems,  thus,  still  completely  unexplored. 


Remarks about the general situation: In general we do not have standard margins
but margins with location and dispersion parameters. In that case, what seems
natural is to estimate, independently, the margin parameters $(\lambda_x,\delta_x)$ ($\delta_x > 0$) and
$(\lambda_y,\delta_y)$ ($\delta_y > 0$) by their ML-estimators $\hat\lambda_x, \hat\delta_x, \hat\lambda_y$ and $\hat\delta_y$, then obtain the
"estimated" reduced values $\hat x_i = (x_i - \hat\lambda_x)/\hat\delta_x$ and $\hat y_i = (y_i - \hat\lambda_y)/\hat\delta_y$, and, finally, the
"estimated" reduced difference $\hat w_i = \hat y_i - \hat x_i$ and proceed as before, with standard
margins.
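The reduction step can be sketched as a small helper that takes already estimated margin parameters and returns the estimated reduced differences. The maximum likelihood fitting of the margins is not shown; the function and argument names are hypothetical.

```python
def reduced_differences(pairs, loc_x, scale_x, loc_y, scale_y):
    """Estimated reduced differences w_i = (y_i - loc_y)/scale_y - (x_i - loc_x)/scale_x.

    loc_* and scale_* stand for the estimated location and dispersion
    parameters of the two Gumbel margins (obtained elsewhere, e.g. by ML).
    """
    if scale_x <= 0 or scale_y <= 0:
        raise ValueError("dispersion parameters must be positive")
    return [(y - loc_y) / scale_y - (x - loc_x) / scale_x for x, y in pairs]

if __name__ == "__main__":
    data = [(10.0, 25.0), (12.0, 21.0)]
    print(reduced_differences(data, 10.0, 2.0, 20.0, 5.0))
```

The resulting differences can then be passed to the estimators for standard margins described in the previous sections.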

As a whole, in the differentiable cases we can expect good behavior - see
the paper on the $\delta$-method by Tiago de Oliveira (1981) - but in the
non-differentiable cases the situation can be more difficult, as is, especially, the
case for the Gumbel model where, with probability one, we do not have $\hat x_i = \hat y_i$. In
this case the use of $\hat T_n$ is suggested, although it is much less efficient than the
case of $f_n$.

This is, thus, another open area of study.